WORLD INTELLECTUAL PROPERTY ORGANIZATION 
International Bureau 




PCT 

INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT) 



(51) International Patent Classification 6 : 

C12N 15/60, 15/53, C12P 7/18, C12N 
9/04, 9/88, 15/74, 15/79, 1/21, 1/19



A1



(11) International Publication Number: WO 96/35795 

(43) International Publication Date: 14 November 1996 (14.11.96)



(21) International Application Number: PCT/US96/06163 

(22) International Filing Date: 2 May 1996 (02.05.96)



(30) Priority Data: 

08/440,377 



12 May 1995 (12.05.95) 



US 



(60) Parent Application or Grant 
(63) Related by Continuation 
US 

Filed on 



08/440,377 (OP) 
12 May 1995 (12.05.95) 



(71) Applicant (for all designated States except US): E.I. DU PONT
DE NEMOURS AND COMPANY [US/US]; 1007 Market
Street, Wilmington, DE 19898 (US).

(72) Inventors; and 

(75) Inventors/Applicants (for US only): NAGARAJAN, Vasantha
[US/US]; 13 Dickinson Lane, Wilmington, DE 19807-3131
(US). NAKAMURA, Charles, Edwin [US/US]; 2 Mount
Vernon Drive, Claymont, DE 19703-2422 (US).

(74) Agent: FLOYD, Linda, A.; E.I. du Pont de Nemours and
Company, Legal Patent Records Center, 1007 Market Street,
Wilmington, DE 19898 (US).



(81) Designated States: AL, AU, BB, BG, BR, CA, CN, CZ, EE,
GE, HU, IS, JP, KP, KR, LK, LR, LT, LV, MG, MK, MN,
MX, NO, NZ, PL, RO, SG, SI, SK, TR, TT, UA, US, UZ,
VN, ARIPO patent (KE, LS, MW, SD, SZ, UG), Eurasian
patent (AM, AZ, BY, KG, KZ, MD, RU, TJ, TM), European
patent (AT, BE, CH, DE, DK, ES, FI, FR, GB, GR, IE, IT,
LU, MC, NL, PT, SE), OAPI patent (BF, BJ, CF, CG, CI,
CM, GA, GN, ML, MR, NE, SN, TD, TG).



Published 

With international search report. 

Before the expiration of the time limit for amending the 
claims and to be republished in the event of the receipt of 
amendments. 

With an indication in relation to a deposited
microorganism furnished under Rule 13bis separately
from the description.

Date of receipt by the International Bureau: 

4 June 1996 (04.06.96) 



(54) Title: PRODUCTION OF 1,3-PROPANEDIOL FROM GLYCEROL BY RECOMBINANT BACTERIA EXPRESSING
RECOMBINANT DIOL DEHYDRATASE



(57) Abstract 

A process is provided for the bioconversion of glycerol to 1,3-propanediol in which genes from a bacterium known to possess a diol
dehydratase enzyme for 1,2-propanediol degradation are cloned into a bacterial host and the host is grown in the presence of glycerol;
expression of the foreign genes in the host cell facilitates the enzymatic conversion of glycerol to 1,3-propanediol, which is isolated from
the culture.




WO 96/35795 PCT/US96/06163



TITLE

PRODUCTION OF 1,3-PROPANEDIOL FROM GLYCEROL
BY RECOMBINANT BACTERIA EXPRESSING
RECOMBINANT DIOL DEHYDRATASE

FIELD OF INVENTION

This invention relates to a process for the
bioconversion of glycerol to 1,3-propanediol by
recombinant bacteria harboring a foreign gene encoding
a diol dehydratase.

BACKGROUND

1,3-Propanediol is a monomer having potential
utility in the production of polyester fibers and the
manufacture of polyurethanes and cyclic compounds.

A variety of chemical routes to 1,3-propanediol
are known. For example, 1,3-propanediol may be
prepared from ethylene oxide and a catalyst in the
presence of phosphine, water, carbon monoxide,
hydrogen and an acid; by the catalytic solution phase
hydration of acrolein followed by reduction; or from
hydrocarbons such as glycerol, reacted in the presence
of carbon monoxide and hydrogen over periodic table
group VIII catalysts. Although it is possible to
generate 1,3-propanediol by these methods, they are
expensive and generate waste streams containing
environmental pollutants.

Biological routes to 1,3-propanediol are known
which utilize feedstocks produced from renewable
resources. For example, bacterial strains able to
convert glycerol into 1,3-propanediol are found
in the genera Klebsiella, Citrobacter, Clostridium,
and Lactobacillus. In these bacteria, glycerol can
enter either an oxidative or a reductive pathway.
Oxidation of glycerol results in the conversion of
glycerol to dihydroxyacetone (DHA) by glycerol
dehydrogenase, and the DHA is phosphorylated by an
adenosine triphosphate (ATP) dependent kinase to yield
dihydroxyacetone phosphate (DHAP), which enters the
glycolytic pathway in the cell. Reduction of glycerol
is accomplished by an initial isomerization and
dehydration catalyzed by glycerol dehydratase to yield
3-hydroxypropionaldehyde, which is further reduced by
1,3-propanediol:NAD+ oxidoreductase to yield
1,3-propanediol, a dead end cellular metabolite. The
expression of at least the first two enzymes involved
in the oxidative pathway as well as the two enzymes
involved in the reductive pathway in K. pneumoniae is
coordinately regulated. The four enzyme system is
functionally linked where the production of
1,3-propanediol from glycerol is dependent on the
presence of the reductants supplied by the DHA to DHAP
pathway.
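
The redox linkage between the two branches can be illustrated with a small stoichiometric sketch. It assumes, purely for illustration, that each glycerol sent down the oxidative branch makes a fixed number of NADH available and that each glycerol reduced to 1,3-propanediol consumes one NADH. The function name and the NADH yields passed to it are hypothetical and are not taken from this specification.

```python
def max_propanediol_fraction(nadh_per_oxidized_glycerol: float) -> float:
    """Largest fraction of glycerol that can end up as 1,3-propanediol
    when the oxidative branch is the only source of reductant.

    Assumption: the reductive branch uses one NADH per glycerol.
    With x the oxidized fraction and n the NADH made per oxidized
    glycerol, NADH balance requires n * x >= 1 - x, so the reduced
    fraction (1 - x) is at most n / (1 + n).
    """
    n = nadh_per_oxidized_glycerol
    if n < 0:
        raise ValueError("NADH yield cannot be negative")
    return n / (1.0 + n)


if __name__ == "__main__":
    for n in (1, 2):  # illustrative NADH yields, not measured values
        print(n, round(max_propanediol_fraction(n), 3))
```

Under these assumptions part of the glycerol is always spent on generating reductant rather than on product.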

The genes responsible for the conversion of
glycerol to 1,3-propanediol have been isolated and are
all encompassed by the dha regulon. In order to make
use of the potential advantages of higher protein
expression and growth rate of recombinant bacteria,
several attempts have been made to express the dha
regulon as heterologous genes in E. coli. For
example, the dha regulon from Citrobacter (Daniel et
al., FEMS Microbiol. Lett., 100, 281 (1992)) and
Klebsiella (Tong et al., Appl. Environ. Microbiol.,
57, 3541 (1991)) have been expressed in E. coli and
have been shown to convert glycerol to
1,3-propanediol. The expression of the dha regulon in
recombinant bacteria offers potential advantages over
wild type production of 1,3-propanediol. The genes
involved in the dha regulon provide both the enzymes
and the necessary reductants needed for the efficient
conversion of glycerol to 1,3-propanediol. However,
simultaneous overexpression of both glycerol
dehydrogenase and glycerol dehydratase results in some
of the glycerol being converted to DHA. It would be
advantageous to convert all the glycerol to
1,3-propanediol by expressing only the reductive
pathway enzymes while providing a different substrate
for the generation of reductant. A preferred system
would provide for a more efficient use of the glycerol
substrate while maintaining high yields of diol
product.

It has long been known that a number of bacteria
are capable of using 1,2-propanediol as a sole carbon
source. It is thought that this ability is conferred
by a specific vitamin B12 dependent diol dehydratase
which is encoded by the pdu operon. The pdu operon is
linked to the cob operon which encodes enzymes needed
for the biosynthesis of vitamin B12 and both operons
are subject to the regulation of the same activator
protein encoded by the pocR gene.

Recently the genes encoding the diol dehydratase
of Klebsiella oxytoca were cloned and sequenced and
the genes were expressed in E. coli. Although active
diol dehydratase was observed in these transformants,
there is no evidence that these clones are able to
metabolize a carbon substrate to 1,3-propanediol.

Various Salmonella and Klebsiella sp. are known
to produce a diol dehydratase which catalyzes the
conversion of 1,2-propanediol, under anaerobic
conditions, to propionaldehyde and eventually to
1-propanol and propionic acid. The diol dehydratase
has also been identified in Clostridia and
Propionibacterium but not in E. coli. The diol
dehydratase from Klebsiella sp. can convert glycerol
to 1,3-propanediol (Forage et al., J. Bacteriol., 149, 413
(1981)).

Although the primary function of the pdu diol
dehydratase is in the metabolism of 1,2-propanediol,
applicants have discovered that the expression of
K. pneumoniae diol dehydratase in E. coli will
catalyze the conversion of glycerol to
1,3-propanediol. The recombinant bacteria expressing
the diol dehydratase pathway convert glycerol to the
desired 1,3-propanediol product and are not dependent
on a linked system as with the glycerol dehydratase
system. Applicants have discovered that
transformation of recombinant bacteria with the pdu
diol dehydratase genes from Klebsiella sp. affords a
new, efficient and cost effective biological route to
1,3-propanediol.
SUMMARY OF THE INVENTION

The present invention comprises a cosmid
comprising a DNA fragment of about 35 kb isolated from
Klebsiella pneumoniae wherein said fragment encodes an
active diol dehydratase enzyme having the restriction
digest in Figure 5, columns numbered 4.

The present invention further comprises a
transformed microorganism comprising a host
microorganism and the above-described cosmid.

The present invention further comprises a gene
encoding an active diol dehydratase enzyme having the
DNA sequence as listed in SEQ ID NO.:1 or a gene
encoding an active alcohol dehydrogenase having the
DNA sequence as listed in SEQ ID NO.:2.

The present invention further comprises a
transformed microorganism comprising a host
microorganism and either of the above-described genes.

The present invention further comprises the
bioconversion of a carbon substrate by transforming a
microbial host with genes capable of expressing a diol
dehydratase and contacting said transformed host with
said substrate.

The present invention further comprises the
bioconversion of a carbon substrate by transforming a
microbial host with genes derived from a cosmid
comprising a fragment of about 35 kb isolated from
Klebsiella pneumoniae wherein said genes encode an
active diol dehydratase enzyme and any other
functional bacterial protein encoded by said cosmid,
and contacting said transformed host with said
substrate.

DESCRIPTION OF THE FIGURES

Figure 1 is a schematic representation of the
gene organization of the pdu-cob region of K. pneumoniae.
The DNA sequence was analyzed using the GCG-Wisconsin
package and the open reading frames were compared to
the S. typhimurium sequence using GAP. The percent
identity and similarity are shown.

Figure 2 is a comparison of the amino acid
sequence encoded by the pduC gene of S. typhimurium
with the amino acid sequence encoded by the pduC gene
of K. pneumoniae.

Figure 3 is a comparison of the amino acid sequence
encoded by the pduC gene of K. pneumoniae with that of
glycerol dehydratase from Citrobacter freundii showing
percent similarity and percent identity.

Figure 4 is a comparison of the amino acid
sequence deduced from an open reading frame of the
glycerol dehydratase gene from K. pneumoniae with the
amino acid sequence encoded by the same gene from
Citrobacter freundii. The figure shows the percent
similarity and percent identity between the two
deduced amino acid sequences.

Figure 5 depicts restriction digests (EcoRI,
BamHI, EcoRV and NotI) of cosmids pKP1, pKP2 and
pKP4, labeled as columns 1, 2 and 4 respectively, and
separation by 0.8% agarose gel electrophoresis.
Molecular size markers were loaded on the lanes at the
end. Columns labeled as number 4 represent the cosmid
containing a diol dehydratase enzyme.
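
The percent identity and percent similarity values quoted for Figures 1 to 4 can be illustrated with the following sketch. It is not the GAP program: it assumes the two sequences are already aligned (gaps written as "-"), and the groups of similar residues are a hypothetical simplification of the scoring-matrix threshold that an alignment program would apply.

```python
# Hypothetical conservative-substitution groups used only for illustration.
SIMILAR_GROUPS = [
    set("ILVM"), set("FYW"), set("KRH"),
    set("DE"), set("NQ"), set("ST"), set("AG"),
]


def percent_identity_similarity(aligned_a: str, aligned_b: str):
    """Return (percent identity, percent similarity) over the aligned
    positions where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = similar = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # gap positions are not scored
        compared += 1
        if a == b:
            identical += 1
            similar += 1  # identical residues also count as similar
        elif any(a in group and b in group for group in SIMILAR_GROUPS):
            similar += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * identical / compared, 100.0 * similar / compared


if __name__ == "__main__":
    # Toy alignment, not a real pduC fragment.
    print(percent_identity_similarity("MKRS-L", "MRRTAL"))
```

Because identical residues are also counted as similar, the similarity value is never lower than the identity value, which matches the way the two numbers are reported together.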

DETAILED DESCRIPTION OF THE INVENTION

As used herein the following terms may be used
for interpretation of the claims and specification.

The term "construct" refers to a plasmid, virus,
autonomously replicating sequence, phage or nucleotide
sequence, linear or circular, of a single- or double-
stranded DNA or RNA, derived from any source, in which
a number of nucleotide sequences have been joined or
recombined into a unique construction which is capable
of introducing a promoter fragment and DNA sequence
for a selected gene product along with appropriate 3'
untranslated sequence into a cell.

The term "transformation" or "transfection"
refers to the acquisition of new genes in a cell after
the incorporation of nucleic acid.

The term "expression" refers to the transcription
and translation to gene product from a gene coding for
the sequence of the gene product. In expression,
a DNA chain coding for the sequence of the gene product is
first transcribed to a complementary RNA, which is
often a messenger RNA, and the transcribed
messenger RNA is then translated into the above-mentioned
gene product if the gene product is a protein.

The term "plasmid" or "vector" or "cosmid" as
used herein refers to an extrachromosomal element
often carrying genes which are not part of the central
metabolism of the cell, and usually in the form of
circular double-stranded DNA molecules.

The term "carbon substrate" means any carbon
source capable of being metabolized by a microorganism
wherein the substrate contains at least one carbon
atom.

The term "dehydratase enzyme" will refer to any
enzyme that is capable of converting a glycerol
molecule to the product 3-hydroxypropionaldehyde. For
the purposes of the present invention the dehydratase
enzymes are either a glycerol dehydratase or a diol
dehydratase having preferred substrates of glycerol
and 1,2-propanediol, respectively.

The term "1,3-propanediol" refers to a compound
of the formula HOCH2-CH2-CH2OH, useful as a monomer in
the production of polymers for fiber manufacture.

The following strains were deposited under the
terms of the Budapest Treaty with the American Type
Culture Collection (ATCC) (12301 Parklawn Drive,
Rockville, MD 20852, USA): ATCC 69789 corresponds to
E. coli DH5α containing cosmid pKP1. ATCC 69790
refers to E. coli DH5α containing cosmid pKP4.

The present invention comprises a process for the
biological production of 1,3-propanediol from glycerol
using recombinant organisms. The process incorporates
a transformed E. coli bacterium, transformed with a
heterologous pdu diol dehydratase gene, having a
specificity for 1,2-propanediol. The transformed
E. coli is grown in the presence of glycerol as a
carbon source and 1,3-propanediol is isolated from the
growth media.

The process of the present invention provides a
rapid, inexpensive and environmentally responsible
source of 1,3-propanediol monomer useful in the
production of polyesters and other polymers.

The invention provides a transformed host cell
suitable for the expression of pdu diol dehydratase.
Suitable host cells will generally be those that do
not normally harbor a diol dehydratase gene.
Preferred in the process of the present invention are
E. coli, Bacillus subtilis, Bacillus licheniformis or
Pichia pastoris. The diol dehydratase within the
transformed host cell has been previously described by
Toraya et al., J. Biol. Chem., 252, 963 (1977).

Isolation of Genes:

The pdu diol dehydratase gene is obtained from
any suitable source, but preferably from a bacterium
known to be able to use 1,2-propanediol as a sole
carbon source. Suitable bacteria known to harbor the
pdu gene include but are not limited to
Klebsiella sp., Clostridia sp., Salmonella sp., and
Citrobacter sp.

Methods of obtaining desired genes from a
bacterial genome are common and well known in the art
of molecular biology. In the present invention
virtually any method may be used to isolate the gene
encoding the desired diol dehydratase. For example,
if the sequence of the gene is known, suitable genomic
libraries created by restriction endonuclease
digestion may be screened with probes complementary to
the desired gene sequence. Once the sequence is
isolated, the DNA may be amplified using standard
primer directed amplification methods such as
polymerase chain reaction (PCR) (U.S. 4,683,202) to
obtain amounts of DNA suitable for transformation
using appropriate vectors.
Alternatively, cosmid libraries may be created
where large segments of genomic DNA (35-45 kb) may be
packaged into vectors and used to transform
appropriate hosts. Cosmid vectors are unique in being
able to accommodate large quantities of DNA.
Generally cosmid vectors have at least one copy of the
cos DNA sequence which is needed for packaging and
subsequent circularization of the foreign DNA. In
addition to the cos sequence these vectors will also
contain an origin of replication such as ColE1 and
drug resistance markers such as a gene conferring
resistance to ampicillin or neomycin. A number of
cosmid vectors are known in the art such as pJB8
(Ish-Horowicz et al., Nucl. Acids Res. 9, 2989 (1981)),
containing an amp marker, ColE1 origin of replication
and a single cos site; and c2RB (Bates et al., Gene,
26, 137 (1983)), containing 2 cos sites, both kanamycin
and ampicillin resistance genes and the ColE1 origin of
replication. Although any cosmid vector is suitable
for use in the present invention, the vector Supercos 1
provided by Stratagene (La Jolla, CA) is most
preferred.
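
The number of cosmid clones worth screening can be estimated with the standard Clarke-Carbon relation N = ln(1 - P) / ln(1 - f), where f is the insert size divided by the genome size and P is the desired probability that a given sequence is present. The sketch below assumes a genome of about 5 Mb, a round value chosen for illustration and not stated in this specification; the function name is likewise hypothetical.

```python
import math


def clones_needed(genome_bp: float, insert_bp: float, probability: float) -> int:
    """Clarke-Carbon estimate of the library size needed so that any
    given sequence is represented with the requested probability."""
    if not 0.0 < probability < 1.0:
        raise ValueError("probability must be strictly between 0 and 1")
    fraction = insert_bp / genome_bp
    if not 0.0 < fraction < 1.0:
        raise ValueError("insert must be smaller than the genome")
    return math.ceil(math.log(1.0 - probability) / math.log(1.0 - fraction))


if __name__ == "__main__":
    ASSUMED_GENOME_BP = 5_000_000  # illustrative assumption only
    for insert_kb in (35, 40, 45):  # insert range quoted in the text
        n = clones_needed(ASSUMED_GENOME_BP, insert_kb * 1000, 0.99)
        print(f"{insert_kb} kb inserts: about {n} clones for 99% coverage")
```

Larger inserts reduce the number of clones that must be screened, which is one practical reason for preferring cosmids when the target is a multi-gene region.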

Typically, to clone cosmids, foreign DNA is
isolated and ligated, using the appropriate
restriction endonucleases, adjacent to the cos region
of the cosmid vector. Cosmid vectors containing the
linearized foreign DNA are then packaged in vitro in a
DNA packaging vehicle such as bacteriophage λ. During
the packaging process the cos sites are cleaved and
the foreign DNA is packaged into the head portion of
the bacterial viral particle. These particles are
then used to transfect suitable host cells such as
E. coli. Once injected into the cell, the foreign DNA
circularizes under the influence of the cos sticky
ends. In this manner, large segments of foreign DNA
can be introduced and expressed in recombinant host
cells.

Cosmid vectors and cosmid transformation methods
were used within the context of the present invention
to clone large segments of genomic DNA from bacterial
genera known to possess genes capable of processing
glycerol to 1,3-propanediol. Specifically, genomic
DNA from K. pneumoniae and K. aerogenes was isolated
by methods well known in the art and digested with the
restriction enzyme Sau3A for insertion into the cosmid
vector Supercos 1 and packaged using GigapackII™
packaging extracts. Following construction of the
vector, E. coli XL1-Blue MR cells were transformed with
the cosmid DNA. Transformants were screened for the
ability to convert glycerol to 1,3-propanediol by
growing the cells in the presence of glycerol and
analyzing the media for 1,3-propanediol formation.

The DNA sequences generated from cosmid
transformations named pKP4 and pKP5 were compared to
DNA sequences in the Genbank data base. Several
independent clones showing homology to the pdu region of
S. typhimurium were identified, suggesting that these
transformants carried DNA encoding 1,2-propanediol
utilizing enzymes including a 1,2-diol dehydratase
gene. In contrast, in transformants named pKP1 and
pKP2, an open reading frame showed extensive homology
to the glycerol dehydratase gene from C. freundii,
suggesting that these transformants contained DNA
encoding the glycerol dehydratase gene.

Cells:

The present invention further comprises a
transformed host cell capable of converting a carbon
substrate to 1,3-propanediol. As disclosed above,
host cells may be transformed with a single gene
encoding the diol dehydratase, a series of specific
genes encoding the diol dehydratase and other enzymes
known to facilitate the process of bioconversion, or
with an entire cosmid DNA fragment. Preferred for use
in the present invention is DH5α E. coli. However, it
is contemplated that other cells will be amenable to
transformation with the instant genes and will
include, but are not limited to, other microorganisms
such as Bacillus sp., Klebsiella sp., Citrobacter sp.,
Clostridia sp. and Pichia sp.

Carbon Substrate:

The present invention provides a carbon substrate
which is converted to the desired 1,3-propanediol end
product via the enzymatic machinery of the transformed
host organism. Virtually any carbon substrate that
will serve as a substrate for a dehydratase enzyme is
suitable for the present invention where alcohols are
of greatest use. Preferred carbon substrates will
include, but are not limited to, glycerol, ethylene
glycol, 1,2-propanediol, 1,2-butanediol, and
2,3-butanediol, wherein glycerol is most preferred.

Purification and Isolation of 1,3-Propanediol:

Methods for the purification of 1,3-propanediol
from fermentation media are known in the art. For
example propanediols can be obtained from cell media
by subjecting the reaction mixture to extraction with
an organic solvent, distillation and column
chromatography (U.S. 5356812). A particularly good
organic solvent for this process is cyclohexane (U.S.
5008473).

1,3-Propanediol may be identified directly by
submitting the media to high pressure liquid
chromatography (HPLC) analysis. Preferred in the
present invention is a method where fermentation media
is analyzed on an analytical ion exclusion column
using a mobile phase of 0.01 N sulfuric acid in an
isocratic fashion.

The following Examples are meant to illustrate
the invention but are not intended to limit it in any
way.

EXAMPLES

Restriction enzyme digestions, phosphorylations,
ligations and transformations were done as described
in Sambrook, J. et al., Molecular Cloning: A
Laboratory Manual, Second Edition, Cold Spring Harbor
Laboratory Press (1989). GeneClean (Stratagene,
La Jolla, CA) was used to remove enzymes from
restriction digests, as specified by the
manufacturers. Restriction enzymes were obtained from
New England Biolabs (Boston, MA) or Promega (Madison,
WI). Growth media was obtained from GIBCO/BRL
(Gaithersburg, MD).

The meaning of abbreviations is as follows: "h"
means hour(s), "min" means minute(s), "sec" means
second(s), and "d" means day(s).
Media:

Synthetic S12 medium was used in the screening of
bacterial transformants for the ability to make
1,3-propanediol. S12 medium contains: ammonium
sulfate, 10 mM; potassium phosphate buffer, pH 7.0,
50 mM; MgCl2, 2 mM; CaCl2, 0.7 mM; MnCl2, 50 µM; FeCl3,
1 µM; ZnCl, 1 µM; CuSO4, 1.72 µM; CoCl2, 2.53 µM;
Na2MoO4, 2.42 µM; and thiamine hydrochloride, 2 µM.

Synthetic S15 medium was also used in the
screening of bacterial transformants for the ability
to make 1,3-propanediol. S15 medium contains:
ammonium sulfate, 10 mM; potassium phosphate buffer,
pH 7.0, 1 mM; MOPS/KOH buffer, pH 7.0, 50 mM; MgCl2,
2 mM; CaCl2, 0.7 mM; MnCl2, 50 µM; FeCl3, 1 µM; ZnCl,
1 µM; CuSO4, 1.72 µM; CoCl2, 2.53 µM; Na2MoO4, 2.42 µM;
and thiamine hydrochloride, 2 µM.
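
The two recipes differ only in their buffers, which is easier to see when they are written as data. In the sketch below the dictionary keys and the helper name are illustrative choices; the concentrations are those listed for S12 and S15, all expressed in mM.

```python
# Concentrations in mM (micromolar values divided by 1000).
S12_MM = {
    "ammonium sulfate": 10.0,
    "potassium phosphate buffer, pH 7.0": 50.0,
    "MgCl2": 2.0,
    "CaCl2": 0.7,
    "MnCl2": 0.050,
    "FeCl3": 0.001,
    "ZnCl": 0.001,
    "CuSO4": 0.00172,
    "CoCl2": 0.00253,
    "Na2MoO4": 0.00242,
    "thiamine hydrochloride": 0.002,
}

# S15 is S12 with less phosphate and with MOPS/KOH as the main buffer.
S15_MM = dict(S12_MM)
S15_MM["potassium phosphate buffer, pH 7.0"] = 1.0
S15_MM["MOPS/KOH buffer, pH 7.0"] = 50.0


def recipe_differences(first: dict, second: dict) -> dict:
    """Map each component whose concentration differs to a
    (first, second) pair; None marks a component that is absent."""
    differences = {}
    for name in sorted(set(first) | set(second)):
        if first.get(name) != second.get(name):
            differences[name] = (first.get(name), second.get(name))
    return differences


if __name__ == "__main__":
    for name, (s12, s15) in recipe_differences(S12_MM, S15_MM).items():
        print(f"{name}: S12={s12} mM, S15={s15} mM")
```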

Isolation and Identification of 1,3-Propanediol:

The conversion of glycerol to 1,3-propanediol was
monitored by HPLC. Analyses were performed using a
Waters Maxima 820 HPLC system using UV (210 nm) and RI
detection. Samples were injected onto a Shodex
SH-1011 column (8 mm x 300 mm, purchased from Waters,
Milford, MA) equipped with a Shodex SH-1011P precolumn
(6 mm x 50 mm), temperature controlled at 50°C, using
0.01 N H2SO4 as mobile phase at a flow rate of
0.5 mL/min. When quantitative analysis was desired,
samples were prepared with a known amount of
trimethylacetic acid as external standard. Typically,
the retention times of glycerol (RI detection),
1,3-propanediol (RI detection), and trimethylacetic
acid (UV and RI detection) were 20.67 min, 26.08 min,
and 35.03 min, respectively.
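
Peak assignment by retention time, and quantitation against a standard of known concentration, can be sketched as follows. The retention times are the typical values quoted above; the matching tolerance, the relative response factor and the function names are assumptions introduced for illustration and would need to be calibrated on a real instrument.

```python
# Typical retention times (minutes) reported for this column and method.
REFERENCE_RT_MIN = {
    "glycerol": 20.67,
    "1,3-propanediol": 26.08,
    "trimethylacetic acid": 35.03,
}


def identify_peak(retention_min: float, tolerance_min: float = 0.3):
    """Return the closest reference compound, or None when the peak
    is further than the (assumed) tolerance from every reference."""
    name = min(REFERENCE_RT_MIN,
               key=lambda n: abs(REFERENCE_RT_MIN[n] - retention_min))
    if abs(REFERENCE_RT_MIN[name] - retention_min) <= tolerance_min:
        return name
    return None


def quantify(analyte_area: float, standard_area: float,
             standard_conc: float, relative_response: float = 1.0) -> float:
    """Analyte concentration from peak areas, given a standard of known
    concentration. relative_response is the analyte signal per unit
    concentration divided by that of the standard (hypothetical here)."""
    if standard_area <= 0 or relative_response <= 0:
        raise ValueError("standard area and response factor must be positive")
    return (analyte_area / standard_area) * standard_conc / relative_response


if __name__ == "__main__":
    print(identify_peak(26.1))
    print(quantify(200.0, 100.0, 5.0, relative_response=2.0))
```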

Production of 1,3-propanediol was confirmed by
gas chromatography/mass spectrometry (GC/MS) with a
Hewlett Packard 5890 Series II gas chromatograph
coupled to a Hewlett Packard 5971 Series mass
selective detector (EI) and a HP-INNOWax column (30 m
length, 0.25 mm i.d., 0.25 micron film thickness).
The retention time and mass spectrum of
1,3-propanediol generated from glycerol were compared
to that of authentic 1,3-propanediol (m/e: 57, 58).

Cells:

Host cells used for cosmid transformations were
E. coli DH5α, fully described in Jesse et al., Focus,
10, 69 (1988) and obtained from GIBCO/BRL.

Construction of K. pneumoniae and K. aerogenes Cosmid Libraries:

K. pneumoniae (ATCC 25955) and K. aerogenes
((K. pneumoniae or Aerobacter aerogenes) ECL 2106),
obtained from Dr. E.C.C. Lin, Harvard Medical School,
Cambridge, MA, and described in Ruch, F.E. and Lin,
E.C.C., Journal of Bacteriology, Vol. 124, p. 348
(Oct. 1975), were grown in 100 ml LB medium for 8 h at
37°C with aeration. Bacteria (25 ml per tube) were
centrifuged at 3,000 rpm for 15 min in a DuPont
Sorvall GLC 2.B centrifuge at room temperature. The
bacteria were pelleted and the supernatant was decanted.
The bacterial cell pellet was frozen at -20°C. The
chromosomal DNA was isolated as outlined below with
special care taken to avoid shearing of DNA (i.e.,
vortexing was avoided). One tube of bacteria was
resuspended in 2.5 ml of 50 mM Tris-10 mM EDTA and
500 µl of lysozyme (1 mg/ml) was added. The pellet
was gently resuspended and the suspension was
incubated at 37°C for 15 min. Sodium dodecyl sulfate
was added to bring the final concentration to 0.5%.
This resulted in the solution becoming clear.
Proteinase K (50 µg/ml) was added and the suspension
was incubated at 55°C for 2 h. The tube was removed
and transferred to an ice bath and sodium chloride was
added to yield a 0.4 M final concentration. Two
volumes of ethanol were added to the liquid. A glass
tube was inserted to the interface and the DNA was
gently spooled. The DNA was dipped into a tube
containing 70% ethanol. After drying in vacuo, the
DNA was resuspended in 500 µl of water and the
concentration of DNA was determined
spectrophotometrically. A diluted aliquot of DNA was
run on a 0.5% agarose gel to determine the intact
nature of the DNA.
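
The additions "to a final concentration" in the lysis procedure follow from a simple mass balance. The sketch below assumes stock solutions of 10% SDS and 5 M NaCl, neither of which is specified in the text, and ignores the small volume of proteinase K; the function name is hypothetical.

```python
def volume_of_stock_to_add(initial_volume_ml: float,
                           stock_conc: float,
                           final_conc: float) -> float:
    """Volume of stock to add to a solution that contains none of the
    solute so that the mixture reaches final_conc.

    Mass balance: stock_conc * v = final_conc * (initial_volume + v),
    hence v = final_conc * initial_volume / (stock_conc - final_conc).
    Both concentrations must use the same unit.
    """
    if not 0 <= final_conc < stock_conc:
        raise ValueError("final concentration must be below the stock")
    return final_conc * initial_volume_ml / (stock_conc - final_conc)


if __name__ == "__main__":
    lysate_ml = 2.5 + 0.5  # Tris-EDTA suspension plus lysozyme solution
    sds_ml = volume_of_stock_to_add(lysate_ml, stock_conc=10.0, final_conc=0.5)
    nacl_ml = volume_of_stock_to_add(lysate_ml + sds_ml,
                                     stock_conc=5.0, final_conc=0.4)
    print(f"10% SDS to add: {sds_ml:.3f} ml")
    print(f"5 M NaCl to add: {nacl_ml:.3f} ml")
```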

The chromosomal DNA was partially digested with
Sau3A as outlined by Sambrook et al., supra. DNA
(2 µg) was digested with 2 units of Sau3A (Promega,
Madison, WI) at room temperature in 200 µl of total
volume. At 0, 5, 10 and 20 min, samples (50 µl) were
removed and transferred to tubes containing 5 µmol of
EDTA. These tubes were incubated at 70°C for 10 min.
An aliquot (2 µl) was withdrawn and analyzed by 0.5%
agarose gel electrophoresis to determine the level of
digestion and the rest of the sample (48 µl) was
stored at -20°C. The gel was stained with ethidium
bromide and visualized under UV to determine the
partial digestion of the chromosomal DNA. A decrease
in the size of the chromosomal DNA with increasing
time was observed, showing that the decrease in the
size of the chromosomal DNA is due to the action of
Sau3A. DNA was extracted from the rest of the sample by
standard protocol methods (Sambrook et al., supra).

A cosmid library of partially digested DNA from
K. pneumoniae or K. aerogenes was prepared using the
Supercos cosmid vector kit and GigapackII™ packaging
extracts using reagents purchased from Stratagene
(La Jolla, CA). The instructions provided by the
manufacturer were followed. The packaged
K. pneumoniae DNA contained a titer of 4 x 10^4 to
1.0 x 10^5 phage and the packaged K. aerogenes DNA
contained 1.2 x 10^5 phage per mL as determined by
transfecting E. coli XL1-Blue MR.

Cosmid DNA was isolated from 6 of the E. coli
transformants and found to contain a large insert of DNA
(25 to 30 kb).

EXAMPLE 1

Screening of E. coli strains, transformed with a
cosmid library DNA from K. pneumoniae and
containing the glycerol dehydratase enzyme,
that produce 1,3-propanediol

Example 1 demonstrated the screening of
transformed E. coli cells with cosmid library DNA from
K. pneumoniae for the presence of an enzyme that
converted glycerol to 1,3-propanediol. Sequencing of
two positive clones revealed that each contained a
gene with a high degree of homology to the gene
encoding glycerol dehydratase.

Six transformation plates containing
approximately 1,000 colonies of E. coli XL1-Blue MR
transfected with K. pneumoniae DNA were washed with
5 ml LB medium and centrifuged. The bacteria were
pelleted and resuspended in 5 ml LB medium + glycerol.
An aliquot (50 µl) was inoculated into a 15 ml tube
containing S12 synthetic medium with 0.2% glycerol +
400 ng per ml of vitamin B12 + 0.001% yeast extract +
50 µg/ml ampicillin (50 amp). The tube was filled
with the medium to the top, wrapped with parafilm and
incubated at 30°C. A slight turbidity was observed
after 48 h. Aliquots, analyzed for product
distribution as described above at 78 h and 132 h,
were positive for 1,3-propanediol, the later time
points containing increased amounts of
1,3-propanediol.
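
For a sense of scale, the glycerol supplied in the screening medium can be converted to molar terms. The sketch below assumes that "0.2% glycerol" means 0.2% weight per volume, which the text does not state, and that conversion is at most one mole of 1,3-propanediol per mole of glycerol; the molar masses are the standard values for C3H8O3 and C3H8O2.

```python
GLYCEROL_G_PER_MOL = 92.09      # C3H8O3
PROPANEDIOL_G_PER_MOL = 76.09   # C3H8O2


def percent_wv_to_mM(percent_wv: float, molar_mass: float) -> float:
    """Convert % (w/v), i.e. grams per 100 ml, to millimolar."""
    grams_per_liter = percent_wv * 10.0
    return grams_per_liter / molar_mass * 1000.0


def max_propanediol_g_per_l(glycerol_percent_wv: float) -> float:
    """Upper bound on 1,3-propanediol titer assuming complete,
    mole-for-mole conversion of the supplied glycerol."""
    glycerol_mM = percent_wv_to_mM(glycerol_percent_wv, GLYCEROL_G_PER_MOL)
    return glycerol_mM / 1000.0 * PROPANEDIOL_G_PER_MOL


if __name__ == "__main__":
    print(f"{percent_wv_to_mM(0.2, GLYCEROL_G_PER_MOL):.1f} mM glycerol")
    print(f"at most {max_propanediol_g_per_l(0.2):.2f} g/L 1,3-propanediol")
```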

The bacteria testing positive for
1,3-propanediol production were plated onto LB +
50 amp, and serial dilutions were performed in order
to isolate single colonies. Forty-eight single
colonies were isolated and checked again for the
production of 1,3-propanediol. Cosmid DNA was
isolated from 6 independent clones and transformed
into E. coli strain DH5α. The transformants were
again checked for the production of 1,3-propanediol.
Two transformants were characterized further and
designated as DH5α-pKP1 and DH5α-pKP2.

DNA sequence analyses of DH5α-pKP1 and DH5α-pKP2
showed the presence of both glycerol dehydrogenase and
glycerol dehydratase genes. Furthermore, the glycerol
dehydratase gene of the transformed E. coli shared 96%
similarity and 95% identity to the glycerol
dehydratase gene from Citrobacter freundii (Figure 4).
Thus, pKP1 and pKP2 appeared to contain the dha regulon
genes from K. pneumoniae.

EXAMPLE 2

Screening E. coli strains, transformed with
cosmid DNA from K. pneumoniae and
containing the 1,2-propanediol dehydratase
enzyme that produce 1,3-propanediol

Example 2 demonstrated the screening of E. coli
cells, transformed with cosmid library DNA from
K. pneumoniae, for the presence of an active enzyme
that enabled the conversion of glycerol to
1,3-propanediol. Sequencing of the positive clones
revealed that each contained a gene with a high degree
of homology to the gene encoding 1,2-propanediol
dehydratase, encoded by the pdu operon.

Single colonies of E. coli XL1-Blue MR
transfected with packaged cosmid DNA from
K. pneumoniae were inoculated into microtiter wells
containing 200 µl of S15 medium + 0.2% glycerol +
400 ng/ml of vitamin B12 + 0.001% yeast extract +
50 µg/ml ampicillin (50 amp). In addition to the
microtiter wells, a master plate containing LB +
50 amp was also inoculated. After 96 h, 100 µl was
withdrawn and centrifuged in a Rainin microfuge tube
containing a 0.2 micron nylon membrane filter.
Bacteria were retained and the filtrate was processed
for HPLC analysis. Positive clones demonstrating
1,3-propanediol production were identified after
screening approximately 240 colonies. Three positive
clones were identified, two of which had grown on LB +
50 amp and one of which had not. Single colonies were
isolated from the two positive clones grown on LB +
50 amp, verified for the production of
1,3-propanediol and designated as pKP4 and pKP5.
Cosmid DNA was isolated from E. coli strains
containing pKP4 and pKP5 and E. coli strain DH5α was
transformed. Six independent transformants were
verified for the production of 1,3-propanediol.
E. coli strain DH5α containing pKP4 or pKP5 was able
to convert glycerol to 1,3-propanediol as described
below.

Production of 1,3-propanediol with
E. coli strains DH5α-pKP4 and DH5α-pKP5

A 2 mL screw capped cryogenic vial, filled to
capacity with media, was inoculated with E. coli
strain DH5α containing pKP4 or pKP5 and incubated at
30°C. The media was composed of S12 medium
supplemented with 0.01% yeast extract, 0.008% casamino
acids, 50 µg/mL ampicillin, 10 µg/mL kanamycin,
0.4 µg/mL vitamin B12, and either 0.2% glycerol or 0.1%
glycerol plus 0.1% D-glucose. Inoculation was
performed directly from an agar plate culture (LB
supplemented with 50 µg/mL ampicillin). After 66 hr,
growth was determined by the absorbance at 600 nm
(OD600) and the extent of reaction and product
distribution determined by HPLC. The results are
presented in Table 1 and Table 2: the sample is
identified by the transformant with a suffix notation
indicating independent transformants, Gly is glycerol,
Glu is D-glucose, Con. is conversion, Sel. is
selectivity, Yld is yield, and NA is not applicable.
Conversion, selectivity and yield were based on
glycerol consumption.
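
As a worked illustration of how the conversion, selectivity and yield columns relate to the measured concentrations, the following Python sketch recomputes them for one sample. The molar definitions used here (conversion as glycerol consumed over initial glycerol, selectivity as 1,3-propanediol formed over glycerol consumed, yield as 1,3-propanediol formed over initial glycerol) are assumptions chosen to agree with the statement that the values were based on glycerol consumption. The function name is hypothetical, and the uninoculated media value is taken as the initial glycerol concentration.

```python
def glycerol_metrics(gly_initial_mM, gly_final_mM, pdo_final_mM):
    """Return (conversion, selectivity, yield) in percent.

    Assumed molar definitions (not spelled out in the text):
      conversion  = glycerol consumed / initial glycerol
      selectivity = 1,3-propanediol formed / glycerol consumed
      yield       = 1,3-propanediol formed / initial glycerol
    """
    consumed = gly_initial_mM - gly_final_mM
    if gly_initial_mM <= 0 or consumed <= 0:
        raise ValueError("glycerol must be present and partly consumed")
    conversion = 100.0 * consumed / gly_initial_mM
    selectivity = 100.0 * pdo_final_mM / consumed
    yield_pct = 100.0 * pdo_final_mM / gly_initial_mM
    return conversion, selectivity, yield_pct


# Media control (23.0 mM glycerol) and sample pKP4-4
# (12.6 mM glycerol, 1.6 mM 1,3-propanediol), as reported in Table 1.
con, sel, yld = glycerol_metrics(23.0, 12.6, 1.6)
print(f"Con. {con:.0f}%  Sel. {sel:.0f}%  Yld. {yld:.0f}%")
```

Under these assumptions the pKP4-4 values round to 45%, 15% and 7%, in line with the corresponding row of Table 1.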

TABLE 1

Production of 1,3-Propanediol from Glycerol

Sample    OD600    [Gly]    [1,3-propanediol]    %       %       %
                   (mM)     (mM)                 Con.    Sel.    Yld.

media     NA       23.0     0.0                  NA      NA      NA
pKP4-3    0.206    14.0     1.0                  39      11      4
pKP4-4    0.297    12.6     1.6                  45      15      7
pKP5-1    0.242    13.4     0.8                  42      6       4
pKP5-2    0.300    13.4     1.4                  42      15      6






TABLE 2

Production of 1,3-Propanediol from Glycerol and Glucose

Sample    OD600    [Gly]    [Glu]    [1,3-propanediol]    %       %       %
                   (mM)     (mM)     (mM)                 Con.    Sel.    Yld.

media     NA       10.7     4.3      0.0                  NA      NA      NA
pKP4-3    0.257    5.3      0.0      1.0                  50      19      9
pKP4-4    0.321    3.9      0.0      1.2                  64      18      11
pKP5-1    0.366    1.9      0.2      3.6                  86      39      34
pKP5-2    0.367    1.5      0.2      4.1                  86      45      36



The size of the insert DNA in both pKP4
and pKP5 varied from 25 to 30 kb. Both clones had
certain fragments that were common and certain
fragments that were different. A 22 kb EcoRI fragment
from pKP4 was eluted from an agarose gel using GeneClean
and later digested with BamHI or EcoRV, and the various
fragments were subcloned into plasmid pIBI31 digested
with EcoRI or BamHI or HincII. Clones containing
inserts were identified and DNA sequence was
generated.

The DNA sequence that was generated showed
homology to the cob and pocR and pdu genes of
S. typhimurium. It is well known that the pdu operon
in S. typhimurium codes for genes that are needed for
1,2-propanediol utilization (Bobik et al.,
J. Bacteriol., 174, 2253 (1992)). Similarly, it is
known that the cob operon encodes genes that are
needed for vitamin B12 synthesis. Within the pdu
operon it is further recognized that the pduC gene
encodes diol dehydratase.

The region of K. pneumoniae coding for the pdu
operon genes is shown in Figure 1. Figure 1 is a
schematic representation of the gene organization of
the pdu-cob region of K. pneumoniae. Comparisons were
made between this pdu-cob region and the same regions
of the genes belonging to S. typhimurium using
algorithms provided by Sequence Analysis Software of
the University of Wisconsin (Genetics Computer Group,
(1991), Version 7, April 1991, 575 Science Drive,
Madison, WI, USA 53711). A table giving the percent
identity and similarity as calculated by the GAP
program of the Genetics Computer Group is shown
below.

Gene    % Similarity             % Identity

pocR    90.48                    84.35
pduA    100                      94.85
pduB    99.16                    96.64
pduC    96.31 (partial seq.)     94.92
pduT    92.42                    82.20



As can be seen by this comparison, and in
Figure 2, the pduC open reading frame showed extensive
homology (98.31%) to the pduC gene of S. typhimurium.
pduC was linked to pduF and showed homology to the
gene encoding glycerol dehydratase from Citrobacter
freundii (Figure 3).
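
To make the percent identity figures concrete, the sketch below counts identical residues between two sequences that have already been aligned. It is only an illustration: the GAP program also performs the alignment itself and scores similarity with a comparison matrix, neither of which is reproduced here. The function name, the convention of skipping gapped columns, and the two short fragments are all hypothetical and are not taken from the figures.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the ungapped columns of a pairwise alignment.

    Assumption: columns containing a gap in either sequence are excluded
    from the denominator; other conventions exist.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical pre-aligned one-letter fragments (illustration only).
fragment_a = "MRSKRFEALAKRPVNQ"
fragment_b = "MRSKRFEVLAQRPVN-"
print(f"{percent_identity(fragment_a, fragment_b):.1f}% identity")
```

A similarity figure would additionally count non-identical residue pairs that score above a threshold in the chosen comparison matrix, which is why similarity is never lower than identity for the same alignment.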

Figure 3 is a comparison of the deduced amino
acid sequence encoded by the pduC gene from
K. pneumoniae vs. the amino acid sequence encoded by
the glycerol dehydratase gene of C. freundii. These
comparisons showed that the percent similarity was
only 84% and the identity only 70%. Thus, the diol
dehydratase encoded by the pduC gene was a clearly
different enzyme and is being used to convert glycerol
to 1,3-propanediol in these transformed E. coli
strains. The sequence of the gene encoding this diol
dehydratase enzyme is given in SEQ ID NO.:1.
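
As a small check on how a deduced amino acid sequence follows from the nucleotide sequence, the sketch below translates the first 48 nucleotides of SEQ ID NO:1 with the standard genetic code. The helper names are hypothetical, and the choice of the standard code (rather than any organism-specific table) is an assumption of this illustration.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
# (third position varying fastest); "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA string in frame 1, stopping at the first stop codon.

    Whitespace is ignored; unrecognised codons are reported as "X".
    """
    dna = "".join(dna.split()).upper()
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        aa = CODON_TABLE.get(dna[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First 48 nucleotides of SEQ ID NO:1 as given in the sequence listing.
first_48_nt = "ATGAGATCGA AAAGATTTGA AGCACTGGCG AAACGCCCTG TGAATCAG"
print(translate(first_48_nt))
```

The one-letter result, MRSKRFEALAKRPVNQ, corresponds to the first sixteen residues (Met Arg Ser Lys Arg Phe Glu Ala Leu Ala Lys Arg Pro Val Asn Gln) of the peptide sequences in the listing.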

Additionally, another open reading frame has been
identified on the pdu gene which showed a high degree
of homology with the regions encoding alcohol
dehydrogenases. For example, deduced amino acid comparisons
showed that this open reading frame had 43% homology
with E. coli alcohol dehydrogenase and 54% homology
with the oxidoreductase of C. freundii. This open
reading frame has been sequenced and is identified as
SEQ ID NO.:2.

EXAMPLE 3 

Screening E. coli strains, transformed with a
cosmid library DNA from K. aerogenes
containing the glycerol dehydratase

Example 3 demonstrated the screening of E. coli cells
transformed with cosmid library DNA from
K. aerogenes for the presence of an active enzyme that
converted glycerol to 1,3-propanediol. Sequencing of
the positive clones revealed that each contained a
gene with a high degree of homology to the gene
encoding 1,2-propanediol dehydratase, encoded by the
pdu operon.

Single colonies of E. coli XL1-Blue MR
transfected with DNA from K. aerogenes were inoculated
into microtiter wells containing 200 µl of S15 medium
+ 0.2% glycerol + 400 ng per ml of vitamin B12 + 0.001%
yeast extract + 50 µg/ml ampicillin (50 amp).

Culture supernatant was analyzed for the presence
of 1,3-propanediol after 96 h. Two colonies were
positive from 2 microtiter plates but after 1 week at
room temperature the bacteria were not viable. A
third microtiter plate was inoculated and a master
plate containing LB + 50 amp was also inoculated. One
positive clone labelled KAE3E10 was identified. The
master plate containing KAE3E10 was used to replate the
positive clone and cosmid DNA was isolated. DH5α
cells were transformed with KAE3E10 DNA and
transformants were screened for the conversion of glycerol
to 1,3-propanediol. KAE3E10 was renamed pKA3 and
contained an insert of approximately 40 kb. The DNA
sequence of pKA3 showed a region that was homologous
to the cob and pocR and pdu operon of S. typhimurium.
Thus, it appeared that pKA3 also coded for a
1,2-propanediol utilizing operon. Diol dehydratase
was presumably responsible for the conversion of
glycerol to 1,3-propanediol.

SEQUENCE LISTING

(i) APPLICANT:

(A) NAME: E. I. DU PONT DE NEMOURS AND COMPANY
(B) STREET: 1007 MARKET STREET
(C) CITY: WILMINGTON
(D) STATE: DELAWARE
(E) COUNTRY: UNITED STATES OF AMERICA
(F) POSTAL CODE (ZIP): 19898
(G) TELEPHONE: 302-892-8112
(H) TELEFAX: 302-773-0164
(I) TELEX: 6717325

(ii) TITLE OF INVENTION: PRODUCTION OF 1,3-PROPANEDIOL
FROM GLYCEROL BY RECOMBINANT
BACTERIA EXPRESSING RECOMBINANT
DIOL DEHYDRATASE

(iii) NUMBER OF SEQUENCES: 8

(iv) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: DISKETTE, 3.50 INCH
(B) COMPUTER: IBM
(C) OPERATING SYSTEM: MICROSOFT WINDOWS 3.1
(D) SOFTWARE: MICROSOFT WORD 2.0C

(v) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE:
(C) CLASSIFICATION:

(vi) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: 08/440,377
(B) FILING DATE: MAY 12, 1995

(vii) ATTORNEY/AGENT INFORMATION:
(A) NAME: LINDA AXAMETHY FLOYD
(B) REGISTRATION NUMBER: 33,692
(C) REFERENCE/DOCKET NUMBER: CR-9692-A

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 4746 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:






ATGAGATCGA AAAGATTTGA AGCACTGGCG AAACGCCCTG TGAATCAGGA TGGTTTCGTT 60
AAGGAGTGGA TTGAAGAGGG CTTTATCGCG ATGGAAAGTC CTAACGATCC CAAACCTTCT 120
ATCCGCATCG TCAACGGCGC GGTGACCGAA CTCGACGGTA AACCGGTTGA GCAGTTCGAC 180
£7K3AT TQACC ACTTTATCGC GCGCTACGGC ATTAATCTCG CCCGGGCCGA AGAAGTGATG 240
GCCATGGATT CGGTTAAGCT CGCCAACATG CTCTGCGACC CGAACGTTAA ACGCAGCGAC 300
ATCGTGCCGC TCACTACCGC GATGACCCCG GCGAAAATCG TGGAAGTGGT GTCGCATATG 360
AACGTGGTCG AGATGATGAT GGCGATGCAA AAAATGCGCG CCCGCCGCAC 6CCGTCOCAG 420
CAjGGCGCATG TCACTAATAT CAAAGATAAT CCGGTACAGA TTGCCGCCGA CGCCGCTGAA 480
GGCGCATGGC GCGGCTTTGA CGAACAGGAG ACCACCGTCG CCGTGGCGCG CTACGCGCCG 540
TTCAACGCCA TCGCCCTGCT GGTGGGTTCA CAGGTTGGCC GCCCCGGCGT CCTCACCCAG 600
A. W-A A WW* AAGAAGCCAC COAGCTGAAA CTGGGCATGC TGGGCCACAC CTGCTATGCC 660
CJAJVACCATTT CGGTATACGG TACGGAACCG GTGTTTACCG ATGGCGATGA CACTCCATGG 720
TCGAAAGGCT TCCTCGCCTC CTCCTACGCC TCGCGCGGCC TGAAAATGCG CTTTACCTCC 780
GGTTCCGGTT CTGAAGTACA GATGGGCTAT GCCGAAGGCA AATCGATGCT TTATCTCGAA 840
GCGCGCTGCA TCTACATCAC CAAAGCCGCC GGGGTGCAAG GCCTGCAGAA TGGCTCCGTC 900
AGCTGTATCG GCGTACCGTC CGCCGTGCCG TCCGGGATCC GCGCCGTACT GGCGGAAAAC 960
CTGATCTGCT CAGCGCTGGA TCTGGAGTGC GCCTCCAGCA ACGATCAAAC CTTTACCCAC 1020
TCGGATATGC GGCGTACCGC GCGTCTGCTG ATGCAGTTCC TGCCAGGCAC CGACTTCATC 1080
TCCTCCGGTT ACTCGGCGGT GCCCAACTAC GACAACATGT TCGCCGGTTC CAACGAAGAT 1140
GCCGAAGACT TCGATGACTA CAACGTGATC CAGCGCGACC TGAAGGTCGA TGGCGGCCTG 1200
CGGCCGGTGC GTGAAGAGGA CGTGATCGCC ATTCGCAACA AAGCCGCCCG CGCGCTGCAG 1260
GCGGTATTTG CCGGCATGGG TTTGCCGCCT ATTACGGATG AAGAGGTAGA AGCCGCCACC 1320


TACGCCCACG GTTCAAAAGA TATGCCTGAG CGCAATATCG TCGAGGACAT CAAGTTTGCT 1380
CAGGAGATCA TCAACAAGAA CCGCAACGGC CTGGAGGTGG TGAAAGCCCT GGCGAAAGGC 1440
GGCTTCCCCG ATGTCGCCCA GGACATGCTC AATATTCAGA AAGCCAAGCT CACCGGCGAC 1500
TACCXGCATA CCTCCGCCAT CATTGTTGGC GAGGGCCA6G TGCTCTCGGC CGTGAATGAC 1560
GTGAACGATT ATGCCGGTCC GGCAACAGGC TACCGCCTGC AAGGCGAGCG CTGGGAAGAG 1620
ATTAAAAATA TCCCGGGCGC GCTCGATCCC AATGAACTTG GCTAAGGGGT GAAAAATGGA 1680
AATTAACGAA ACGCTGCTGC GCCAGATTAT CGAAGAGGTG CTGTCGGAGA TGAAATCAGG 1740
CGCAGATAAG CCGGTCTCCT TTAGCGCGTC TGCGGCTTCT GTCGCCTCTG CCGCGCCGGT 1800
CGCCGTTGCG CCTGTGTCCG GCGACAGCTT CCTGACGGAA ATCGGCGAAG CCAAACCCGG 1860
CACGCAGCAG GATGAAGTCA TTATTGCCGT CGGGCCAGCG TTTGGTCTGG CGCAAACCGC 1920
CAATATCGTC GGCATTCCGC ATAAAAATAT TCTGCGCGAA GTGATCGCCG GCATTGAGGA 1980
AGAAGGCATC AAAGCCCGGG TGATCCGCTG CTTTAAGTCA TCTGACGTCG CCTTCGTGGC 2040
AGTGGAAGGC AACCGCCTGA GCGGCTCCGG CATCTCGATC GGTATTCAGT CGAAAGGCAC 2100
CACCGTCATC CACCAGCGCG GCCTGCOGCC GCTTTCCAAT CTGGAACTCT TCCCGCAGGC 2160
GCCGCTGCTA ACGCTGGAAA CCTACCGTCA GATTGGCAAA AACGCCGCGC GCTACGCCAA 2220
ACGCGAGTCG CCGCAGCCGG TGCCGACGCT TAACGATCAG ATGGCTCGTC CCAAATACCA 2280
GGCGAAGTCG GCCATTTTGC ACATTAAAGA GACCAAAXAC GTGGTGACGG GCAAAAACCC 2340
GCAGGAACXG CGCGTGGCGC TTTAACAAAG GATATCCCGA TGAATACCGA CGCAATTGAA 2400
TCCATGGTAC GCGACCTGCT GAGCCGGATG AACAGCCTAC AGGACGGGGT AACGCCCGCG 2460
CCAGCCGCGC CGACAAACGA CACCGTTCGC CAGCCAAAAG TTAGCGACTA CCCGTTAGCG 2520
ACCTGCCATC CGGAGTGGCT CAAAACCGCT ACCAATAAAA CGCTCGATGA CCTGACGCTG 2580
GAGAACGTAT TAAGCGATCG CGTTACGGCG CAGGACATGC GCATCACTCC GGAAACGCTG 2640
CCTATGCAGG CGGCGATCGC CCAGGATGCC GGACGCGATC GGCTGGCGAT GAACTTTGAG 2700
CGGGCCGCAG AGCTCACCGC GGTTCCCGAC GAOCGAATCC TTGAGATCTA CAGCGCCCTG 2760
CGCCCATACC GTTCCACCCA GGCGGAGCTA CTGGCGATCG CTGATGACCT CGAGCATCGC 2820
TACCAGGCAC GACTCTGTGC CGCCTTTGTT CGGGAAGCGG CCGGGCTGTA CATCGAGCGT 2880
AAGAAGCTGA AAGGCGACGA TTAACAGGGG GTAAGCATGC GCTATATCGC TGGCATTGAT 2940
ATTGGCAACT CCTCGACAGA AGTCGCCCTG GCGACGGTCG ATGACGCAGG TGTGCTGAAC 3000
ATTCGCCACA GCGCGTTGGC TGAAACCACG GGTATAAAAG GCACATTACG AAATGTGTTC 3060
GGTATCCAGG AGGCGCTAAC GCAGGCGGCA AAAGCGGCCG GCATTCAGCT CAGCGATATT 3120
TCGCTTATTC GCAlTAACGA AGCCACGCCG GTCATTGGCG ATGTGGCGAT GGAAACCATC 3180
ACGGAAACCA TCATCACCGA GTCCACCATG ATCGGCCATA ACCCGAAGAC ACCCGGCGGC 3240
GTCGGACTGG GGGTCGGCAT CACCATCACA CCAGAGGCGC TGCTGTCCTG CTCCGCGGAC 3300
ACTCCCTATA TTCTGGTGGT CTCCTCGGCC TTTGACTTTG CCGATGTCGC CGCGATGGTC 3360
AATGCGGCAA CGGCAGCGGG CTATCAGATA ACCGGCATTA TTTTGCAGCA GGATGACGGC 3420
GTGCTGGTCA ATAACCGGCT ACAGCAACCG CTACCGGTGA TCGACGAAGT TCAGCATATC 3480
GACCGGATTC CACTTGGCAT GCTGGCGGCC GTCGAGGTCG CTTTACCCGG TAAGATCATC 3540
GAAACGCTCT CCAACCCTTA CGGTATTGCG ACCGTTTTCG ATCTCAACGC CGAGGAGAGC 3600
CAAAATATCG TGCCAATGGC ACGGGCGCTG ATTGGCAACC GCTCGGCCGT GCTGGTGAAA 3660
ACCCCCTCCG GCGACGTCAA GGCCCGCGCT ATTCCGGCAG GTAATCTGTT GCXCATCGCT 3720
CAGGGGCGCA GCGTACAGGT TGATGTGGCC GCCGGGGCGG AAGCCATCAT GAAAGCGGTT 3780
GACGGCTGCG GCAAACTGGA CAACGTCGCG GGAGAAGCGG GCACCAATAT CGGCGGCATG 3840
CTAGAGCACG TGCGCCAGAC CATGGCGGAG CTTACCAATA AGCCAGCTCA GGAGATCCGC 3900
ATTCAGGATC TGCTGGCCGT TGATACGGCG GTGCCAGTCA GCCTGACCGG CGGTCTTGCG 3960
GGGGAGTTCT CGCTGGAGCA GGCGGTGGGT ATCGCCTCGA TGGTCAAGTC GGATCGCCTG 4020
CAGATGGCCC TCATCGCCCG TGAAATTGAG CACAAACTGC AGATTGCGGT TCAGGTGGGC 4080
GGCGCCGAAG CGGAGGCGGC CATTCTTGGG GCGCTCACCA CTCCCGGCAC CACGCGCCCG 4140
CTGGCGATCC TCGATCTGGG CGCCGGGTCG ACCGACGCCT CCATTATCAA TGCGCAGGGA 4200
GAGATCAGCG CCACTCACCT GGCCGGCGCC GGCGATATGG TCACGATGAT CATCGCCCGC 4260
GAGCTGGGGC TTGAGGACCG CTACCTGGCG GAAGAGATCA AAAAATATCC GCTGGCAAAA 4320
GTCGAAAGCC TGTTTCATCT GCGTCATGAA GACGGCAGCG TCCAGTTTTT TCCGTCGGCC 4380
TTACCACCGA CGGTATTTGC CCGCGTCTGC GTGAAACCGG ATGAACTGGT TCCCCTGCCC 4440
GGCGATCTGC CGCTGGAGAA AGTGCGCGCA ATTCGCCGTA GCGCCAAATC ACGCGTCTTT 4500
GTCACCAACG CCCTGCGAGC GTTACGCCAG GTGAGCCCTA CCGGCAACAT TCGCGACATC 4560
COGTTCGTGG TGCTGGTGGG CGGCTCOTCC CTCGATTTCG AGATCCCCCA GCTGGTCACC 4620
GACGCGCTGG CGCACTACCG GCTGGCTGCC GGGCGCGGCA ACATCCGCGG CTGTGAAGGC 4680
CCACGCAATG CGGTCGCCAG CGGATTACTC CTTTCCTGGC AAAAAGGAGG CACACATGGA 4740
GAGTAG 4746

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1335 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

ATGCATACCT TTTCTCTGCA AACGCGCCTC TACAGCGGCC CGGGCAGCCT GGCCGCGCTG 60
CAGCGCTTTA GCCATCAGCA CATCTGGATC GTCTGCGACG GCTTCCTGGC GCGCTCGCCG 120
CTGCTTGACC GACTGCGCGC CGCGCTGCCC GCCAGCAACC GCGTCAGCGT GTTCAGCGAT 180
ATTACACCGG ATCCGACCAT TCACACCGTG GCGAAAGGGA TAGCGCAGAT GCAGGCCCTG 240
CGTCCGCAGG TGGTGATCGG CTTCGGCGGC GGCTCGGCGA TGGATGCCGC CAAGGCTATC 300
GTCTGGTTCA GCCAGCAGGG CGGTCTGCCT GTTGACACCT GCGTGGCGAT CCCCACCACC 360
AGCGGTACCG GTTCGGAAGT GACCAGCGCC TGCGTCATCA GCGACCCGGA AAAAGGGATC 420
AAGTACCCGC TGTTCCATGA GGCGCTCTGT CCCGACATGG CGATCATCGA CCCGACGCTG 480
GTGGTTAGCG TACCGCCCAC CATCACAGCC CATACCGGGC TGGACGCGCT GACCCACGCC 540
CTGGAGGCAT GGGTCTCGCC GCAGGCCACC GATTTTACCG ATGCGCTGGC GGAAAAGGCC 600
GCCAGGCTGG TGTTTCGCGC CCTGCCCGTT GCGATTCGTC AGGGCGACTG CATTGCGACC 660
CGCAGCAAAA TGCACAATGC ATCAACCCTC GCCGGTATGG CCTTTAGCCA GGCTGGCCTT 720
GGGCTCAATC ATGCGATCGC CCATCAGCTT GGCGGCCAGT TTCACCTCCC CCATGGCCTG 780
GCCAATGCGC TGCTGCTGAC CGCGGTGATC CGCTTCAATG CCGGCGAGCC GCGAGCGGCT 840
AAGCGCTATG CACGCCTGGC CAGGGCCTAC CGCTTCTGCC CGCCCGCAGC TGGCGAACAG 900
GAGGCTTTCC AGGCGCTGCT TACCGCGGTG GAAACGCTGA AACAGCAGTG CGCCATTCCC 960
CCCCTCAAGG GCGCGCTGCA GGAAAAGTAT CCCCTTTTCT TATCGCATCA ACCAGTTCAA 1020
CATCATTGCT CAGACGCACC TGCCCGCACA GCACGAAACC GACCAGGTGG CCGGCAATCA 1080
CCAGCGGGAT GGAAAAATCG GTTAACCCCG CATGACAGCG GTAGATACAC AGCTGTCTTT 1140
TTTCGAGGCT TCCAGCCCGC CGCAGCGGTC GCTCATGCGA CAGCGTCCGC TGTGCTCCGG 1200
GTGCTGACGC ATCAGCTGGC AAAACGGCGT GAAATTAAAC AATTCAGAAA TCTCATCACC 1260
GTGAATATTG ACGACCACAA CCGCCAGACT GGTGGCTTGC GCAAAATCCT GTGCGATTTT 1320
ATTGATGAGT TCTGA 1335

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 100 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

Met Arg Ser Lys Arg Phe Glu Ala Leu Ala Lys Arg Pro Val Asn Gln
1 5 10 15

Asp Gly Phe Val Lys Glu Trp Ile Glu Glu Gly Phe Ile Ala Met Glu
20 25 30

Ser Pro Asn Asp Pro Lys Pro Ser Ile Arg Ile Val Asn Gly Ala Val
35 40 45

Thr Glu Leu Asp Gly Lys Pro Val Glu Gln Phe Asp Leu Ile Asp His
50 55 60

Phe Ile Ala Arg Tyr Gly Ile Asn Leu Ala Arg Ala Glu Glu Val Met
65 70 75 80

Ala Met Asp Ser Val Lys Leu Ala Asn Met Leu Cys Asp Pro Asn Val
85 90 95

Lys Arg Ser Asp
100

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 59 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Met Arg Ser Lys Arg Phe Glu Ala Leu Ala Lys Arg Pro Val Asn Gln
1 5 10 15

Asp Gly Phe Val Lys Glu Trp Ile Glu Glu Gly Phe Ile Ala Met Glu
20 25 30

Ser Pro Asn Asp Pro Lys Pro Ser Ile Lys Ile Val Asn Gly Ala Val
35 40 45

Thr Glu Leu Asp Gly Lys Pro Val Ser Glu Phe
50 55

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 554 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

Met Arg Ser Lys Arg Phe Glu Ala Leu Ala Lys Arg Pro Val Asn Gln
1 5 10 15

Asp Gly Phe Val Lys Glu Trp Ile Glu Glu Gly Phe Ile Ala Met Glu
20 25 30

Ser Pro Asn Asp Pro Lys Pro Ser Ile Arg Ile Val Asn Gly Ala Val
35 40 45

Thr Glu Leu Asp Gly Lys Pro Val Glu Gln Phe Asp Leu Ile Asp His
50 55 60

Phe Ile Ala Arg Tyr Gly Ile Asn Leu Ala Arg Ala Glu Glu Val Met
65 70 75 80

Ala Met Asp Ser Val Lys Leu Ala Asn Met Leu Cys Asp Pro Asn Val
85 90 95

Lys Arg Ser Asp Ile Val Pro Leu Thr Thr Ala Met Thr Pro Ala Lys
100 105 110

Ile Val Glu Val Val Ser His Met Asn Val Val Glu Met Met Met Ala
115 120 125

Met Gln Lys Met Arg Ala Arg Arg Thr Pro Ser Gln Gln Ala His Val
130 135 140

Thr Asn Ile Lys Asp Asn Pro Val Gln Ile Ala Ala Asp Ala Ala Glu
145 150 155 160

Gly Ala Trp Arg Gly Phe Asp Glu Gln Glu Thr Thr Val Ala Val Ala
165 170 175

Arg Tyr Ala Pro Phe Asn Ala Ile Ala Leu Leu Val Gly Ser Gln Val
180 185 190

Gly Arg Pro Gly Val Leu Thr Gln Cys Ser Leu Glu Glu Ala Thr Glu
195 200 205

Leu Lys Leu Gly Met Leu Gly His Thr Cys Tyr Ala Glu Thr Ile Ser
210 215 220

Val Tyr Gly Thr Glu Pro Val Phe Thr Asp Gly Asp Asp Thr Pro Trp
225 230 235 240

Ser Lys Gly Phe Leu Ala Ser Ser Tyr Ala Ser Arg Gly Leu Lys Met
245 250 255

Arg Phe Thr Ser Gly Ser Gly Ser Glu Val Gln Met Gly Tyr Ala Glu
260 265 270

Gly Lys Ser Met Leu Tyr Leu Glu Ala Arg Cys Ile Tyr Ile Thr Lys
275 280 285

Ala Ala Gly Val Gln Gly Leu Gln Asn Gly Ser Val Ser Cys Ile Gly
290 295 300

Val Pro Ser Ala Val Pro Ser Gly Ile Arg Ala Val Leu Ala Glu Asn
305 310 315 320

Leu Ile Cys Ser Ala Leu Asp Leu Glu Cys Ala Ser Ser Asn Asp Gln
325 330 335

Thr Phe Thr His Ser Asp Met Arg Arg Thr Ala Arg Leu Leu Met Gln
340 345 350

Phe Leu Pro Gly Thr Asp Phe Ile Ser Ser Gly Tyr Ser Ala Val Pro
355 360 365

Asn Tyr Asp Asn Met Phe Ala Gly Ser Asn Glu Asp Ala Glu Asp Phe
370 375 380

Asp Asp Tyr Asn Val Ile Gln Arg Asp Leu Lys Val Asp Gly Gly Leu
385 390 395 400

Arg Pro Val Arg Glu Glu Asp Val Ile Ala Ile Arg Asn Lys Ala Ala
405 410 415

Arg Ala Leu Gln Ala Val Phe Ala Gly Met Gly Leu Pro Pro Ile Thr
420 425 430

Asp Glu Glu Val Glu Ala Ala Thr Tyr Ala His Gly Ser Lys Asp Met
435 440 445

Pro Glu Arg Asn Ile Val Glu Asp Ile Lys Phe Ala Gln Glu Ile Ile
450 455 460

Asn Lys Asn Arg Asn Gly Leu Glu Val Val Lys Ala Leu Ala Lys Gly
465 470 475 480

Gly Phe Pro Asp Val Ala Gln Asp Met Leu Asn Ile Gln Lys Ala Lys
485 490 495

Leu Thr Gly Asp Tyr Leu His Thr Ser Ala Ile Ile Val Gly Glu Gly
500 505 510

Gln Val Leu Ser Ala Val Asn Asp Val Asn Asp Tyr Ala Gly Pro Ala
515 520 525

Thr Gly Tyr Arg Leu Gln Gly Glu Arg Trp Glu Glu Ile Lys Asn Ile
530 535 540

Pro Gly Ala Leu Asp Pro Asn Glu Leu Gly
545 550

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 555 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Met Arg Arg Ser Lys Arg Phe Glu Val Leu Ala Gln Arg Pro Val Asn
1 5 10 15

Gln Asp Gly Leu Ile Gly Glu Trp Pro Glu Glu Gly Leu Ile Ala Met
20 25 30

Glu Ser Pro Tyr Asp Pro Ala Ser Ser Val Lys Val Glu Asn Gly Arg
35 40 45

Ile Val Glu Leu Asp Gly Lys Ser Arg Ala Glu Phe Asp Met Ile Asp
50 55 60

Arg Phe Ile Ala Asp Tyr Ala Ile Asn Val Pro Glu Ala Glu Arg Ala
65 70 75 80

Met Gln Leu Asp Ala Leu Glu Ile Ala Arg Met Leu Val Asp Ile His
85 90 95

Val Ser Arg Glu Glu Ile Ile Ala Ile Thr Thr Ala Ile Thr Pro Ala
100 105 110

Lys Arg Leu Glu Val Met Ala Gln Met Asn Val Val Glu Met Met Met
115 120 125

Ala Leu Gln Lys Met Arg Ala Arg Arg Thr Pro Ser Asn Gln Cys His
130 135 140

Val Thr Asn Leu Lys Asp Asn Pro Val Gln Ile Ala Ala Asp Ala Ala
145 150 155 160

Glu Ala Gly Ile Arg Gly Phe Ser Glu Gln Glu Thr Thr Val Gly Ile
165 170 175

Ala Arg Tyr Ala Pro Phe Asn Ala Leu Ala Leu Leu Val Gly Ser Gln
180 185 190

Cys Gly Ala Pro Gly Val Leu Thr Gln Cys Ser Val Glu Glu Ala Thr
195 200 205

Glu Leu Glu Leu Gly Met Arg Gly Leu Thr Ser Tyr Ala Glu Thr Val
210 215 220

Ser Val Tyr Gly Thr Glu Ser Val Phe Thr Asp Gly Asp Asp Thr Pro
225 230 235 240

Trp Ser Lys Ala Phe Leu Ala Ser Ala Tyr Ala Ser Arg Gly Leu Lys
245 250 255

Met Arg Tyr Thr Ser Gly Thr Gly Ser Glu Ala Leu Met Gly Tyr Ser
260 265 270

Glu Ser Lys Ser Met Leu Tyr Leu Glu Ser Arg Cys Ile Phe Ile Thr
275 280 285

Lys Gly Ala Gly Val Gln Gly Leu Gln Asn Gly Ala Val Ser Cys Ile
290 295 300

Gly Met Thr Gly Ala Val Pro Ser Gly Ile Arg Ala Val Leu Ala Glu
305 310 315 320

Asn Leu Ile Ala Ser Met Leu Asp Leu Glu Val Ala Ser Ala Asn Asp
325 330 335

Gln Thr Phe Ser His Ser Asp Ile Arg Arg Thr Ala Arg Thr Leu Met
340 345 350

Gln Met Leu Pro Gly Thr Asp Phe Ile Phe Ser Gly Tyr Ser Ala Val
355 360 365

Pro Asn Tyr Asp Asn Met Phe Ala Gly Ser Asn Phe Asp Ala Glu Asp
370 375 380

Phe Asp Asp Tyr Asn Ile Leu Gln Arg Asp Leu Met Val Asp Gly Gly
385 390 395 400

Leu Arg Pro Val Thr Glu Glu Glu Thr Ile Ala Ile Arg Asn Lys Ala
405 410 415

Ala Arg Ala Ile Gln Ala Val Phe Arg Glu Leu Gly Leu Pro Leu Ile
420 425 430

Ser Asp Glu Glu Val Asp Ala Ala Thr Tyr Ala His Gly Ser Lys Asp
435 440 445

Met Pro Ala Arg Asn Val Val Glu Asp Leu Ala Ala Val Glu Glu Met
450 455 460

Met Lys Arg Asn Ile Thr Gly Leu Asp Ile Val Gly Ala Leu Ser Ser
465 470 475 480

Ser Gly Phe Glu Asp Ile Ala Ser Asn Ile Leu Asn Met Leu Arg Gln
485 490 495

Arg Val Thr Gly Asp Tyr Leu Gln Thr Ser Ala Ile Leu Asp Arg Gln
500 505 510

Phe Asp Val Val Ser Ala Val Asn Asp Ile Asn Asp Tyr Gln Gly Pro
515 520 525

Gly Thr Gly Tyr Arg Ile Ser Ala Glu Arg Trp Ala Glu Ile Lys Asn
530 535 540

Ile Ala Gly Val Val Gln Pro Gly Ser Ile Glu
545 550 555

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 131 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

Arg Ala Val Leu Ala Glu Asn Leu Ile Ala Ser Met Leu Asp Leu Glu
1 5 10 15

Val Ala Ser Ala Asn Asp Gln Thr Phe Ser His Ser Asp Ile Arg Arg
20 25 30

Thr Ala Arg Thr Leu Met Gln Met Leu Pro Gly Thr Asp Phe Ile Phe
35 40 45

Ser Gly Tyr Ser Ala Val Pro Asn Tyr Asp Asn Met Phe Ala Gly Ser
50 55 60

Asn Phe Asp Ala Glu Asp Phe Asp Asp Tyr Asn Ile Leu Gln Arg Asp
65 70 75 80

Leu Met Val Asp Gly Gly Leu Arg Pro Val Thr Glu Ala Glu Thr Ile
85 90 95

Ala Ile Arg Gln Lys Ala Ala Arg Ala Ile Gln Ala Val Phe Arg Glu
100 105 110

Leu Gly Leu Pro Pro Ile Ala Asp Glu Glu Val Glu Ala Ala Thr Tyr
115 120 125

Ala Gln Gly
130

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 150 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Val Ser Cys Ile Gly Met Thr Gly Ala Val Pro Ser Gly Ile Arg Ala
1 5 10 15

Val Leu Ala Glu Asn Leu Ile Ala Ser Met Leu Asp Leu Glu Val Ala
20 25 30

Ser Ala Asn Asp Gln Thr Phe Ser His Ser Asp Ile Arg Arg Thr Ala
35 40 45

Arg Thr Leu Met Gln Met Leu Pro Gly Thr Asp Phe Ile Phe Ser Gly
50 55 60

Tyr Ser Ala Val Pro Asn Tyr Asp Asn Met Phe Ala Gly Ser Asn Phe
65 70 75 80

Asp Ala Glu Asp Phe Asp Asp Tyr Asn Ile Leu Gln Arg Asp Leu Met
85 90 95

Val Asp Gly Gly Leu Arg Pro Val Thr Glu Glu Glu Thr Ile Ala Ile
100 105 110

Arg Asn Lys Ala Ala Arg Ala Ile Gln Ala Val Phe Arg Glu Leu Gly
115 120 125

Leu Pro Leu Ile Ser Asp Glu Glu Val Asp Ala Ala Thr Tyr Ala His
130 135 140

Gly Ser Lys Asp Met Pro
145 150

WHAT IS CLAIMED IS:

1. A cosmid comprising a DNA fragment of about
35 kb isolated from Klebsiella pneumoniae wherein said
fragment encodes an active diol dehydratase enzyme
having the restriction digest in Figure 5, columns
numbered 4.

2. A transformed microorganism comprising a host
microorganism and the cosmid of Claim 1.

3. The transformed microorganism of Claim 2
wherein the host microorganism is E. coli, and which
is deposited with the American Type Culture Collection
as accession number ATCC 69790.

4. The cosmid of Claim 1 which when transformed
into bacteria causes metabolism of glycerol to
1,3-propanediol.

5. A transformed microorganism comprising a host
microorganism and a DNA fragment of the cosmid of
Claim 1, said fragment encoding an active functional
protein.

6. A DNA fragment comprising a gene encoding a
diol dehydratase enzyme, said gene encompassed by the
cosmid of Claim 1.

7. A gene encoding an active diol dehydratase
enzyme having the DNA sequence as listed in SEQ ID
NO:1.

8. A gene encoding an active alcohol
dehydrogenase having the DNA sequence as listed in SEQ
ID NO:2.

9. A transformed microorganism comprising a host
microorganism and the DNA sequence of Claim 7 or
Claim 8.

10. A transformed microorganism comprising
E. coli DH5α and the DNA sequence of Claim 7 or
Claim 8.

11. A process comprising the bioconversion of a
carbon substrate by transforming a microbial host with
genes capable of expressing a diol dehydratase and
contacting said transformed host with said substrate.

12. A process comprising the bioconversion of a
carbon substrate by transforming a microbial host with
genes derived from a cosmid comprising a fragment of
about 35 kb isolated from Klebsiella pneumoniae
wherein said genes encode an active diol dehydratase
enzyme and any other functional bacterial protein
encoded by said cosmid, and contacting said
transformed host with said substrate.

13. The process of Claim 12 wherein said other
functional bacterial protein is an alcohol
dehydrogenase.

14. The process of Claim 11 or 12 wherein the
carbon substrate is selected from the group consisting
of ethylene glycol, 1,2-propanediol, glycerol and
2,3-butanediol.

15. The process of Claim 14 wherein the carbon
substrate is glycerol.

16. The process of Claim 15 wherein the glycerol
is converted to 1,3-propanediol.

17. The process of Claim 11 or 12 wherein the
microbial host is selected from the group consisting
of members of the genera Escherichia, Bacillus,
Klebsiella, Citrobacter, Saccharomyces, Clostridium
and Pichia.

18. The process of Claim 17 wherein the microbial
host is selected from the group consisting of members
of species E. coli, Bacillus subtilis, Bacillus
licheniformis and Pichia pastoris.

19. The process of Claim 18 wherein the microbial
host is E. coli.

20. The process of Claim 11 or 12 wherein the
genes are diol dehydratase genes isolated from the
group consisting of members of the genera
Klebsiella sp., Clostridia sp., Salmonella sp. and
Citrobacter sp.

21. The process of Claim 16 wherein said
transformed host is recombinant E. coli DH5α
containing a gene encoding diol dehydratase enzyme
wherein said gene comprises the DNA sequence of SEQ ID
NO:1.

22. The process of Claim 20 wherein said
transformed host is recombinant E. coli DH5α
containing a gene encoding diol dehydratase enzyme
wherein said gene comprises the DNA sequence of SEQ ID
NO:1.

23. The product of the process of Claim 11 or 12.